Abstract: Consider transmission over a binary-input additive white
Gaussian noise channel using a fixed low-density parity check
code. We consider the posterior measure over the code bits and 
the corresponding correlation between two codebits, averaged 
over the noise realizations. We show that for low enough noise 
variance this average correlation decays exponentially fast with 
the graph distance between the code bits. One consequence of this 
result is that for low enough noise variance the GEXIT functions 
(further averaged over a standard code ensemble) of the belief 
propagation and optimal decoders are the same. 

I. Introduction 

We consider transmission over a binary-input additive white
Gaussian noise (BIAWGN) channel using low-density parity
check (LDPC) codes and the optimal MAP decoder. We are
interested in the behavior of the correlation between two code 
bits as a function of their graph distance. In [1] we treated 
this problem, for the regime of high noise, for a special code 
ensemble containing a sufficiently large fraction of degree 
one variable nodes. In the present contribution we attack the 
problem in the low noise regime. 

The behavior of correlations between relevant degrees of 
freedom is of central interest in the analysis of Gibbs mea- 
sures, and various approaches have been developed to tackle 
such problems. The Gibbs measures associated with the opti- 
mal decoder of LDPC codes confront us with new challenges 
which invalidate the direct use of the standard methods. For 
example it is easy to see that the standard Dobrushin type 
methods [2] fail due to the presence of hard constraints. In 
the high noise regime we were able to convert the problem 
(at least in the special case of [1]) to a spin glass containing 
a mixture of soft and hard constraints for which appropriate
cluster expansions can be applied. These expansions have been 
applied to the simpler case of low density generator matrix 
codes (LDGM) in [3] for the high noise regime, which boils 
down to a high temperature spin glass. 

The low noise regime which is our interest here is a truly 
low temperature spin glass problem for which all the above 
methods fail. The general idea of our strategy is to apply a 
duality transformation to the LDPC Gibbs measure. It turns 
out that the dual problem does not correspond to a well defined 
communications problem, and in fact it does not even corre- 
spond to a well defined Gibbs measure because the "weight" 
takes positive as well as negative values. Nevertheless the dual
problem has the flavor of a high noise LDGM system (or high
temperature spin glass) and we are able to treat it through
cluster expansions. There exist a host of such expansions [4],
but we wish to stress that the simplest ones do not apply to the
present situation for at least two reasons. The first is that there
exist arbitrarily large portions of the dual system which are in a
low noise (or low temperature) phase with positive probability
(this is related to the Griffiths singularity phenomenon [6]). The
second is that the weights of the dual problem are not positive,
so that the method in [3] does not work. It turns out that a
cluster expansion originally devised by Berretti [5] is very well
suited to overcome all these problems.

Our analysis can also be carried through for a class of 
other channels including the BSC and BEC, but we do 
not give the details here. The case of the BEC is special 
because under duality the Gibbs weight remains positive and 
the communication problem using LDPC codes on the BEC($\epsilon$)
transforms to a real communication problem using LDGM
codes on the BEC($1 - \epsilon$) [7].

In the last section we sketch an application of our main
result to the MAP-GEXIT function (in other words the first 
derivative of the input-output entropy with respect to the noise 
parameter). We prove that in the low noise regime where 
the average correlation decays (fast enough) the MAP-GEXIT 
function can be exactly computed from the density evolution 
analysis. These curves remain non-trivial all the way down to 
zero noise as long as there are degree one variable nodes (e.g.
Poisson LDPC codes). This proves that a non-trivial replica 
solution is the exact expression for the input-output entropy 
of a class of LDPC codes (containing a fraction of degree 
one variable nodes) on the BIAWGN channel. Previously the 
replica expression was only known to be a one-sided bound 
[8], [9] for general ensembles and channels. The equality had 
been obtained previously for some ensembles on the BEC 
using duality [10] and the interpolation method [11]. 

II. Decay of Correlations 

Let $x^n$ be a binary codeword of length $n$ from a fixed
LDPC code with bounded, but otherwise arbitrary, variable
and check node degrees. In the sequel we call $l_{\max}$, $r_{\max}$ the
maximal variable and check degrees. The noise variance of the
BIAWGN channel is $\epsilon^2$ and $y^n$ denotes the received message.
Assuming without loss of generality that the channel input
is the all zero codeword, the output can be mapped onto the
half-log-likelihood ratios
\[ l_i = \frac{1}{2}\ln\frac{p_{Y|X}(y_i|0)}{p_{Y|X}(y_i|1)}, \]
where $p_{Y|X}(y|x)$ is the channel's transition matrix. The channel
outputs are i.i.d. with distribution $p_{Y|X}(y|0)$, which induces a
distribution on the half-log-likelihood ratios $l_i$. Mapping the
codewords $x^n$ to spin configurations $\sigma^n$ with $\sigma_i = (-1)^{x_i}$, the
posterior measure becomes (for a uniform prior)
\[ p_{X^n|Y^n}(\sigma^n|l^n) = \frac{1}{Z_P}\prod_{i=1}^{n} e^{l_i\sigma_i}\prod_{c=1}^{m}\frac{1}{2}\Big(1+\prod_{i\in c}\sigma_i\Big). \]
In this expression $\prod_{c=1}^{m}$ is a product over all the parity check
constraints of the code and $\prod_{i\in c}$ is a product over variable
nodes attached to the check node $c$. The partition function $Z_P$
is simply the normalizing factor



\[ Z_P = \sum_{\sigma^n\in\{-1,+1\}^n}\prod_{i=1}^{n} e^{l_i\sigma_i}\prod_{c=1}^{m}\frac{1}{2}\Big(1+\prod_{i\in c}\sigma_i\Big). \]

The average of an arbitrary function $f(\sigma^n)$ with respect to the
above measure is denoted by $\langle f(\sigma^n)\rangle_P$, where the subscript $P$
refers to parity check (later we use various other brackets).
This is still a random quantity which depends on the channel
output realization. Further averages with respect to the noise
are denoted by $E_{l^n}[\langle f(\sigma^n)\rangle_P]$. Of course it does not make
sense to permute the expectation $E_{l^n}$ and the bracket $\langle - \rangle_P$
because of the normalizing factor $Z_P$ in the denominator.
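
To make these definitions concrete, the following minimal Python sketch evaluates $Z_P$ and a bracket $\langle f\rangle_P$ by exhaustive enumeration of spin configurations. It is an illustration only: the function name `gibbs_average`, the toy three-bit repetition code and the numerical half-log-likelihoods are hypothetical choices that do not come from the paper, and the enumeration is feasible only for very small $n$.

```python
import itertools
import math

def gibbs_average(checks, l, f):
    """Return (Z_P, <f>_P) for the LDPC Gibbs measure by exhaustive enumeration.

    checks: list of tuples of variable indices, one tuple per parity check
    l:      list of half-log-likelihood ratios, one per variable
    f:      function of a spin configuration (tuple of +1/-1 values)
    """
    n = len(l)
    z = 0.0
    acc = 0.0
    for sigma in itertools.product((+1, -1), repeat=n):
        # Hard constraints: each factor (1 + prod_i sigma_i) / 2 is either 0 or 1.
        if any(math.prod(sigma[i] for i in c) != 1 for c in checks):
            continue
        # Soft part of the weight: prod_i exp(l_i * sigma_i).
        w = math.exp(sum(li * si for li, si in zip(l, sigma)))
        z += w
        acc += w * f(sigma)
    return z, acc / z

# Hypothetical toy example: 3 bits with checks x0 + x1 = 0 and x1 + x2 = 0.
TOY_CHECKS = [(0, 1), (1, 2)]
TOY_L = [0.3, -0.1, 0.7]

z_p, m0 = gibbs_average(TOY_CHECKS, TOY_L, lambda s: s[0])
```

For this toy code only the all $+1$ and all $-1$ configurations satisfy the checks, so the sketch should return $Z_P = 2\cosh(l_1+l_2+l_3)$ and $\langle\sigma_1\rangle_P = \tanh(l_1+l_2+l_3)$.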

Our main result is on the average correlation between any 
two codebits defined by 

\[ C_P(i,j) = E_{l^n}\big[\,|\langle\sigma_i\sigma_j\rangle_P - \langle\sigma_i\rangle_P\langle\sigma_j\rangle_P|\,\big]. \]

Theorem 1 (Decay of Correlations): Consider
transmission over a BIAWGN channel with noise variance $\epsilon^2$
using an arbitrary fixed LDPC code. Set $k = (l_{\max} r_{\max})^{1/2}$.
Let $\mathrm{dist}(i,j)$ denote the graph distance between the codebits.
There exist strictly positive purely numerical constants $\epsilon_0$,
$c_1$, $c_2$ such that for $\epsilon^2 < \epsilon_0^2 k^{-2}(\ln k)^{-1}$ and $\mathrm{dist}(i,j) > 4 l_{\max}$
we have



\[ C_P(i,j) \le c_1 e^{-\frac{c_2}{\epsilon^2 k}\mathrm{dist}(i,j)}. \tag{1} \]



Remark: By graph distance we mean the smallest possible 
number of edges on a path connecting i and j. 
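
As an illustration of this notion of distance, here is a short breadth-first search on the bipartite Tanner graph. It is a sketch under assumptions: the function name `graph_distance` and the representation of the code as a list of checks (each a tuple of variable indices) are hypothetical. Since every path between two variable nodes alternates between variables and checks, the distance returned is always even.

```python
from collections import deque

def graph_distance(checks, i, j):
    """Smallest number of Tanner-graph edges on a path from variable i to variable j.

    checks: list of tuples of variable indices, one tuple per check node.
    Returns None when i and j lie in different connected components.
    """
    # Adjacency of the bipartite graph: variable -> indices of checks containing it.
    var_to_checks = {}
    for c, members in enumerate(checks):
        for v in members:
            var_to_checks.setdefault(v, []).append(c)

    dist = {i: 0}
    seen_checks = set()
    queue = deque([i])
    while queue:
        v = queue.popleft()
        if v == j:
            return dist[v]
        for c in var_to_checks.get(v, []):
            if c in seen_checks:
                continue
            seen_checks.add(c)
            for w in checks[c]:
                if w not in dist:
                    # variable -> check -> variable costs two edges
                    dist[w] = dist[v] + 2
                    queue.append(w)
    return None
```

For a chain of three degree-two checks, `graph_distance([(0, 1), (1, 2), (2, 3)], 0, 3)` counts six edges.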

In fact we will derive (and use in Section V) a slightly more
general estimate. Suppose that the bits $x_i$ are transmitted at
different noise levels $\epsilon_i \le \epsilon$. Then
\[ C_P(i,j) \le c_1 e^{-c\left(\frac{1}{\epsilon_i^2}+\frac{1}{\epsilon_j^2}\right)} e^{-\frac{c_2}{\epsilon^2 k}\mathrm{dist}(i,j)}, \tag{2} \]
where $c > 0$ is a strictly positive number. In particular if bits
$x_i$ or $x_j$ are perfectly received we recover $C_P(i,j) = 0$.

III. Duality Formulas 

A general theory of duality for codes on graphs can be found
in [14] and references therein. Here we derive by elementary
means formulas that are useful to us. Let $C$ be a binary parity
check code and $C^\perp$ its dual. We apply the Poisson summation
formula
\[ \sum_{\sigma^n\in C} f(\sigma^n) = \frac{1}{|C^\perp|}\sum_{\tau^n\in C^\perp}\hat f(\tau^n), \]
where the Fourier (or Hadamard) transform is
\[ \hat f(\tau^n) = \sum_{\sigma^n\in\{-1,+1\}^n} f(\sigma^n)\, e^{i\frac{\pi}{4}\sum_{j=1}^{n}(1-\tau_j)(1-\sigma_j)}, \]
to the partition function $Z_P$ of an LDPC code $C$. The dual
code $C^\perp$ is an LDGM code with codewords given by $\tau^n$, where
\[ \tau_i = \prod_{c\in i} u_c \tag{3} \]
and $u_c$ are the $m$ information bits ($i$ and $c$ will always
refer to the variable and check nodes of the original LDPC
Tanner graph, and $c \in i$ means that $c$ is connected to $i$). A
straightforward application of the Poisson formula then yields
an extended form of the MacWilliams identity,

\[ Z_P = \frac{1}{|C^\perp|}\, e^{\sum_{i=1}^{n} l_i}\, Z_G, \tag{4} \]
where
\[ Z_G = \sum_{u^m\in\{-1,+1\}^m}\prod_{i=1}^{n}\Big(1+e^{-2l_i}\prod_{c\in i}u_c\Big). \]
This expression formally looks like the partition function of
an LDGM code (hence the subscript $G$) with "channel log-likelihoods"
$g_i$ such that $\tanh g_i = e^{-2l_i}$. This is truly the
case for the BEC($\epsilon$), where $l_i = 0, +\infty$ and hence $g_i = +\infty, 0$,
which still corresponds to a BEC($1 - \epsilon$). The logarithm of
partition functions is related to the input-output entropy and
one recovers (taking the $\epsilon$ derivative) the well known duality
relation between EXIT functions of a code and its dual on
the BEC [7]. For other channels however this is at best a
formal (but useful) analogy since the weights can be negative
or equivalently the $g_i$ can assume complex values.
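
The identity between the two partition functions can be checked numerically on a very small code. The sketch below is only an illustration under assumptions: the function names `z_parity` and `z_generator`, the toy code and the half-log-likelihood values are hypothetical. The prefactor is written as $2^{-m}$, which coincides with $1/|C^\perp|$ when the $m$ checks are linearly independent, as they are for the toy code used here.

```python
import itertools
import math

def z_parity(checks, l):
    """Z_P: sum over codewords (in spin form) of exp(sum_i l_i sigma_i)."""
    total = 0.0
    for sigma in itertools.product((+1, -1), repeat=len(l)):
        if all(math.prod(sigma[i] for i in c) == 1 for c in checks):
            total += math.exp(sum(li * si for li, si in zip(l, sigma)))
    return total

def z_generator(checks, l):
    """Z_G: sum over information bits u_c of prod_i (1 + exp(-2 l_i) tau_i)."""
    total = 0.0
    for u in itertools.product((+1, -1), repeat=len(checks)):
        term = 1.0
        for i, li in enumerate(l):
            # tau_i is the product of the information bits of the checks attached to i.
            tau_i = math.prod(u[c] for c, members in enumerate(checks) if i in members)
            term *= 1.0 + math.exp(-2.0 * li) * tau_i
        total += term
    return total

# Hypothetical toy code: 4 bits, 2 linearly independent checks.
TOY_CHECKS = [(0, 1, 2), (1, 2, 3)]
TOY_L = [0.4, -0.3, 0.8, 0.5]  # a negative value makes some dual weights negative

LHS = z_parity(TOY_CHECKS, TOY_L)
RHS = 2.0 ** (-len(TOY_CHECKS)) * math.exp(sum(TOY_L)) * z_generator(TOY_CHECKS, TOY_L)
```

The two numbers `LHS` and `RHS` should agree up to rounding, even though individual terms of the dual sum are negative.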

We will need a duality formula for the correlations themselves.
We introduce a bracket $\langle - \rangle_G$ which is not a true
probabilistic expectation (but it is linear),
\[ \langle f(u^m)\rangle_G = \frac{1}{Z_G}\sum_{u^m\in\{-1,+1\}^m} f(u^m)\prod_{i=1}^{n}\Big(1+e^{-2l_i}\prod_{c\in i}u_c\Big). \]
The denominator may vanish, but it can be shown that when
this happens the numerator also does so, and in a way that
ensures the finiteness of the ratio (this will become quite clear
in all our subsequent calculations). Taking the logarithm of (4)
and then the derivative with respect to $l_i$ we find

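\[ \langle\sigma_i\rangle_P = \frac{1}{\tanh 2l_i} - \frac{\langle\tau_i\rangle_G}{\sinh 2l_i}, \tag{5} \]
and differentiating once more with respect to $l_j$, $j \neq i$,
\[ \langle\sigma_i\sigma_j\rangle_P - \langle\sigma_i\rangle_P\langle\sigma_j\rangle_P = \frac{\langle\tau_i\tau_j\rangle_G - \langle\tau_i\rangle_G\langle\tau_j\rangle_G}{\sinh 2l_i\,\sinh 2l_j}. \tag{6} \]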
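
These two duality formulas, $\langle\sigma_i\rangle_P = 1/\tanh 2l_i - \langle\tau_i\rangle_G/\sinh 2l_i$ and its analogue for the connected correlation with the factor $1/(\sinh 2l_i \sinh 2l_j)$, can be verified by brute force on a toy example. The sketch below is an illustration only: the helper names `bracket_p`, `bracket_g` and `duality_residuals`, the toy code and the numerical values are hypothetical and not part of the paper.

```python
import itertools
import math

def bracket_p(checks, l, f):
    """<f>_P: average over codewords in spin form with weights exp(sum_i l_i sigma_i)."""
    num = den = 0.0
    for sigma in itertools.product((+1, -1), repeat=len(l)):
        if all(math.prod(sigma[i] for i in c) == 1 for c in checks):
            w = math.exp(sum(li * si for li, si in zip(l, sigma)))
            num += w * f(sigma)
            den += w
    return num / den

def bracket_g(checks, l, g):
    """<g>_G: signed average over information bits u_c; g receives the tuple tau^n."""
    n = len(l)
    num = den = 0.0
    for u in itertools.product((+1, -1), repeat=len(checks)):
        tau = tuple(
            math.prod(u[c] for c, members in enumerate(checks) if i in members)
            for i in range(n)
        )
        # The weight may be negative when some l_i < 0.
        w = math.prod(1.0 + math.exp(-2.0 * l[i]) * tau[i] for i in range(n))
        num += w * g(tau)
        den += w
    return num / den

def duality_residuals(checks, l, i, j):
    """Differences between the two sides of each duality formula (both should be ~0)."""
    si, sj = math.sinh(2 * l[i]), math.sinh(2 * l[j])
    mag_p = bracket_p(checks, l, lambda s: s[i])
    mag_g = bracket_g(checks, l, lambda t: t[i])
    res_mag = mag_p - (1.0 / math.tanh(2 * l[i]) - mag_g / si)
    corr_p = bracket_p(checks, l, lambda s: s[i] * s[j]) - mag_p * bracket_p(checks, l, lambda s: s[j])
    corr_g = bracket_g(checks, l, lambda t: t[i] * t[j]) - mag_g * bracket_g(checks, l, lambda t: t[j])
    res_corr = corr_p - corr_g / (si * sj)
    return res_mag, res_corr

# Hypothetical toy code and half-log-likelihoods (one of them negative on purpose).
RES_MAG, RES_CORR = duality_residuals([(0, 1, 2), (1, 2, 3)], [0.4, -0.3, 0.8, 0.5], 0, 3)
```

Both residuals should vanish up to rounding. The sketch requires $l_i \neq 0$ and $l_j \neq 0$, since it divides by the hyperbolic sines directly.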



We stress that in (6), $\tau_i$ and $\tau_j$ are given by products of
information bits as in (3). The left hand side of (6) is obviously
bounded. It is less obvious to see this directly on the right
hand side and here we just note that the pole at $l_i = 0$ is
harmless since, for $l_i = 0$, the bracket has all its "weight" on
configurations with $\tau_i = 1$. Similar remarks apply to (5). In
any case, we will beat the poles by using the following trick.
For any $0 < s < 1$ and $|x| \le 1$ we have $|x| \le |x|^s$, thus
\[ C_P(i,j) \le 2^{1-s} E_{l^n}\big[\,|\langle\sigma_i\sigma_j\rangle_P - \langle\sigma_i\rangle_P\langle\sigma_j\rangle_P|^s\,\big], \]
and using (6) and Cauchy-Schwarz
\[ C_P(i,j) \le 2^{1-s}\, E[(\sinh 2l)^{-2s}]\; E_{l^n}\big[\,|\langle\tau_i\tau_j\rangle_G - \langle\tau_i\rangle_G\langle\tau_j\rangle_G|^{2s}\,\big]^{1/2}. \tag{7} \]

The following bound 



\[ E[(\sinh 2l)^{-2s}] \le \frac{c}{1-2s}\, e^{-d\,\frac{s(1-2s)}{\epsilon^2}} \tag{8} \]



on the prefactor turns out to be important in our analysis. Here
$0 < s < \frac{1}{2}$ and $c > 0$, $d > 0$ are purely numerical constants.

IV. Proof of Main Theorem 

From inequalities (7), (8) of the previous section we see
that it suffices to prove that
\[ C_G(i,j;s) = E_{l^n}\big[\,|\langle\tau_i\tau_j\rangle_G - \langle\tau_i\rangle_G\langle\tau_j\rangle_G|^{2s}\,\big] \]
decays. As explained in the introduction, the main tool used
here is a cluster expansion of Berretti [5] (which has the
advantage of dealing with the Griffiths singularity
phenomenon while at the same time not using the positivity
of the weights). Here we can only explain the resulting
expansion, adapted to our setting, without giving the full
derivation (a good starting point is [6]). We have

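\[ \langle\tau_i\tau_j\rangle_G - \langle\tau_i\rangle_G\langle\tau_j\rangle_G = \frac{1}{2}\sum_X K_{i,j}(X)\Big(\frac{Z_G(X^c)}{Z_G}\Big)^{2}, \]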



where 



E E (• 

u (i) „(2) r compatible 



X 



.(1) 



(1) 



)\{E k 



her 



c£X 



withX 



and 



E, 



(9) 



Here $u_c^{(1)}$ and $u_c^{(2)}$ are two independent copies of the
information bits (these are also known as real replicas) and
$\tau_k^{(\alpha)} = \prod_{c\in k} u_c^{(\alpha)}$. To explain what $X$ and $\Gamma$ are, we keep
referring to checks and variables in the original LDPC Tanner
graph language: checks are indexed by $c$ and variables by $i$.
Given a subset $S$ of variable or check nodes of the Tanner
graph let $\partial S$ be the subset of neighboring nodes. The sum
over $X$ is carried over clusters of check nodes such that: (i)
$X$ is "connected via hyperedges" (this means that $X = \partial\tilde X$ for
some connected subset $\tilde X$ of variable nodes; $\tilde X$ is connected
if any pair of variable nodes can be joined by a path all of
whose variable nodes lie in $\tilde X$) and (ii) $X$ contains both
$\partial i$ and $\partial j$. $\Gamma$ is a set of variable nodes (all distinct). We say
that $\Gamma$ is compatible with $X$ if: (i) $\partial\Gamma \cup \partial i \cup \partial j = X$, (ii)
$\partial\Gamma \cap \partial i \neq \emptyset$ and $\partial\Gamma \cap \partial j \neq \emptyset$, (iii) there is a walk connecting
$\partial i$ and $\partial j$ such that all its variable nodes are in $\Gamma$. Finally,
\[ Z_G(X^c) = \sum_{u_c,\ c\in X^c}\ \prod_{\text{all } i \text{ s.t. } \partial i\cap X=\emptyset}\Big(1+e^{-2l_i}\prod_{c\in i}u_c\Big). \]

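Using $|\sum_i a_i|^{2s} \le \sum_i |a_i|^{2s}$ for $0 < 2s < 1$ and then
Cauchy-Schwarz, we find
\[ C_G(i,j;s) \le \frac{1}{2^{2s}}\sum_X T_1(X)\,T_2(X), \]
where
\[ T_1(X)^2 = E_{l^n}\big[\,|K_{i,j}(X)|^{4s}\,\big] \tag{10} \]
and
\[ T_2(X)^2 = E_{l^n}\Big[\Big(\frac{Z_G(X^c)}{Z_G}\Big)^{8s}\Big]. \tag{11} \]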



Bound on $T_1(X)$. Trivially bounding the spins in (9) by 1
we deduce (in the first inequality we need $4s < 1$ and in the
second $8s < 1$)



r| 



r compatible 
withX 



< 4 |jf| ^2 2(4S+1 



)|r | MkM|r| 



r compatible 
withX 



Now let us set $s = \frac{1}{16}$ and take $\epsilon^2 < (10\ln 2)^{-1}$ for simplicity.
The bound becomes
\[ T_1(X)^2 \le 4^{|X|}\sum_{\Gamma \text{ compatible with } X} e^{-\frac{1}{16\epsilon^2}|\Gamma|}. \]
If $\Gamma$ is compatible with $X$ we necessarily have $|\partial\Gamma| \ge |X| - |\partial i| - |\partial j|$,
and since $|\partial\Gamma| \le |\Gamma|\, l_{\max}$, we get $|\Gamma| \ge (|X| - 2l_{\max})/l_{\max}$.
Also, the maximum number of variable nodes
which have an intersection with $X$ is $|X|\, r_{\max}$. Thus there are
at most $2^{|X| r_{\max}}$ possible choices for $\Gamma$. These remarks imply
\[ T_1(X)^2 \le 2^{(2+r_{\max})|X|}\, e^{-\frac{1}{16\epsilon^2}(|X| - 2l_{\max})/l_{\max}}. \]

Bound on $T_2(X)$. The ratio in (11) is not easily estimated directly
because the weights in $Z_G$ are not positive. However we can
use the duality transformation (4) to get a new ratio of partition
functions with positive weights,
\[ \frac{Z_G(X^c)}{Z_G} = \exp\Big(\sum_{\text{all } i \text{ s.t. } \partial i\cap X\neq\emptyset} l_i\Big)\,\frac{|C^\perp(X^c)|}{|C^\perp|}\,\frac{Z_P(X^c)}{Z_P}, \]
with
\[ Z_P(X^c) = \sum_{\sigma_i,\ \partial i\cap X=\emptyset}\ \prod_{\text{all } i \text{ s.t. } \partial i\cap X=\emptyset} e^{l_i\sigma_i}\prod_{c\in X^c}\frac{1}{2}\Big(1+\prod_{i\in c \text{ and } \partial i\cap X=\emptyset}\sigma_i\Big), \]
which is the partition function corresponding to the subgraph
(of the full Tanner graph) induced by checks of $X^c$ and
variable nodes $i$ s.t. $\partial i \cap X = \emptyset$. Moreover $C^\perp(X^c)$ is the dual
of the latter code $C(X^c)$ defined on the subgraph. By standard
properties of the rank of a matrix, the rank of the parity check
matrix of $C(X^c)$, which is obtained by removing rows (checks)
and columns (variables) from the parity check matrix of $C$, is
not larger than the rank of the parity check matrix of $C$. Thus
$|C^\perp(X^c)| \le |C^\perp|$, since the size of a dual code is 2 raised to
the rank of the parity check matrix. Moreover
\[ \exp\Big(\sum_{\text{all } i \text{ s.t. } \partial i\cap X\neq\emptyset} l_i\Big)\, Z_P(X^c) \le Z_P. \]
To see this one must recognize that the left hand side is the
sum of terms of $Z_P$ corresponding to $\sigma^n$ such that $\sigma_i = +1$
for $\partial i \cap X \neq \emptyset$ (and all terms are $\ge 0$). These remarks imply
\[ T_2(X)^2 \le 1. \]
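
The rank property used in this bound, namely that deleting rows and columns of a parity check matrix cannot increase its rank over GF(2), can be illustrated on a small example. The sketch below rests on assumptions: the helper names `gf2_rank` and `submatrix` and the 3 x 6 matrix are hypothetical choices made for illustration.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 lists."""
    # Pack each row into an integer bitmask (bit k holds column k).
    masks = [sum(bit << k for k, bit in enumerate(row)) for row in rows]
    rank = 0
    while masks:
        pivot = masks.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        # Eliminate that bit from every remaining row.
        masks = [m ^ pivot if m & low else m for m in masks]
    return rank

def submatrix(rows, keep_rows, keep_cols):
    """Keep only the listed rows (checks) and columns (variables)."""
    return [[rows[r][c] for c in keep_cols] for r in keep_rows]

# Hypothetical parity check matrix with 3 checks and 6 variables.
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]
# Remove check 0 together with the variables attached to it (columns 0, 1, 3).
H_SUB = submatrix(H, keep_rows=[1, 2], keep_cols=[2, 4, 5])

RANK_FULL = gf2_rank(H)
RANK_SUB = gf2_rank(H_SUB)
```

Here the full matrix has rank 3 and the reduced one has rank 2, so the dual code of the subgraph has $2^2$ codewords against $2^3$ for the full dual code.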

Now we can conclude the proof of Theorem 1. From
the bounds on (10) and (11) we get for $\epsilon^2 < (l_{\max}(2 + r_{\max})\,16\ln 2)^{-1}$

c G (i,j;s X' ~ " 2 ' 

x 

The clusters $X$ connect $\partial i$ and $\partial j$ and thus have sizes
$|X| \ge \frac{1}{2}\mathrm{dist}(i,j)$. Moreover the number of clusters of a given
size grows at most like $(l_{\max} r_{\max})^{|X|}$. Working out the final
bounds, and putting them in a symmetrical form, the net result
is that for $\mathrm{dist}(i,j) > 4 l_{\max}$ we can find a purely numerical
constant $\epsilon_0$ such that for $\epsilon^2 < \epsilon_0^2 k^{-2}(\ln k)^{-1}$
\[ C_G\big(i,j;s = \tfrac{1}{16}\big) \le c_1 e^{-\frac{c_2}{\epsilon^2 k}\mathrm{dist}(i,j)}, \]
where $k = (l_{\max} r_{\max})^{1/2}$ and $c_1$ and $c_2$ are strictly positive
numbers. Using this bound with (7) and (8) concludes the
proof of (1) and (2).

V. Exactness of Density Evolution 

In this section we illustrate an application of the theorem
to the GEXIT function of standard irregular LDPC ensembles
with degrees bounded by $l_{\max}$, $r_{\max}$. Let $h_n = \frac{1}{n}H(X^n|Y^n)$
be the input-output entropy. The MAP-GEXIT function is in
general defined as
\[ \frac{d}{d(\epsilon^{-2})}E_{\rm LDPC}[h_n]. \]
Theorem 2 (Exactness of Density Evolution): One can find
a strictly positive number $\epsilon_1$ (in general smaller than the $\epsilon_0$ of
Theorem 1) such that for $\epsilon^2 < \epsilon_1^2 k^{-2}(\ln k)^{-1}$
\[ \lim_{n\to+\infty}\frac{d}{d(\epsilon^{-2})}E_{\rm LDPC}[h_n] = \frac{1}{2}\Big(\lim_{d\to+\infty} E_{{\rm LDPC},l}\big[\tanh(l+\Delta^{(d)})\big] - 1\Big), \]
where $\tanh(l+\Delta^{(d)})$ is the soft bit-estimate given by the density
evolution analysis of the BP decoder.

The proof of this theorem rests on the simple formula [12],
[13] valid for the BIAWGN channel
\[ \frac{d}{d(\epsilon^{-2})}E_{\rm LDPC}[h_n] = \frac{1}{2}\big(E_{{\rm LDPC},l^n}[\langle\sigma_o\rangle_P] - 1\big), \tag{12} \]
where the variable node $o$ is selected uniformly at random
(the result is independent of the node due to symmetry). In
this formula $E_{l^n}[\langle\sigma_o\rangle_P]$ is the MAP soft-bit estimate.

In fact one can verify that the density evolution analysis
is equivalent to performing statistical mechanical sums on a
tree whose leaves are the spins (variable nodes) with free
boundary conditions (channel outputs as initial conditions).
More precisely if we call $N_d(o)$ the neighborhood of depth $d$
of $o$ for $d$ even (that is all the nodes of the Tanner graph that
are at a distance $\le d$ from $o$) and consider the LDPC Gibbs
measure $\langle - \rangle_{N_d(o)}$ restricted to the subgraph $N_d(o)$, we can
verify by explicit calculation that
\[ E_{{\rm LDPC},l}\big[\tanh(l+\Delta^{(d)})\big] = E_{{\rm LDPC},l^n}\big[\langle\sigma_o\rangle_{N_d(o)} \,\big|\, N_d(o) \text{ is a tree}\big]. \]
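
The equivalence between belief propagation and exact Gibbs averages on a tree can be seen on the smallest non-trivial case, a tree of depth $d = 2$. The sketch below is an illustration under assumptions: the function names, the toy tree (root $o$ attached to two checks, each with two leaf variables) and the channel values are hypothetical. For one fixed realization of the half-log-likelihoods it compares the exhaustive computation of $\langle\sigma_o\rangle_{N_d(o)}$ with the BP estimate $\tanh(l_o + \Delta)$, where each check contributes the inverse hyperbolic tangent of the product of $\tanh l_j$ over its leaves.

```python
import itertools
import math

def exact_soft_bit(checks, l, o):
    """<sigma_o> for the LDPC Gibbs measure on a small graph, by enumeration."""
    num = den = 0.0
    for sigma in itertools.product((+1, -1), repeat=len(l)):
        if all(math.prod(sigma[i] for i in c) == 1 for c in checks):
            w = math.exp(sum(li * si for li, si in zip(l, sigma)))
            num += w * sigma[o]
            den += w
    return num / den

def bp_soft_bit_depth2(checks, l, o):
    """tanh(l_o + Delta) on a depth-2 tree whose leaves carry only channel outputs."""
    delta = 0.0
    for c in checks:
        if o in c:
            # Check-to-root message in half-log-likelihood form.
            prod_tanh = math.prod(math.tanh(l[j]) for j in c if j != o)
            delta += math.atanh(prod_tanh)
    return math.tanh(l[o] + delta)

# Hypothetical depth-2 tree: root 0, checks {0,1,2} and {0,3,4}, leaves 1..4.
TREE_CHECKS = [(0, 1, 2), (0, 3, 4)]
TREE_L = [0.2, 0.5, -0.4, 1.1, 0.3]

EXACT = exact_soft_bit(TREE_CHECKS, TREE_L, 0)
BP = bp_soft_bit_depth2(TREE_CHECKS, TREE_L, 0)
```

The two values should coincide up to rounding. Deeper trees are handled by applying the same two message updates recursively from the leaves to the root.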

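Now for $d$ fixed, $N_d(o)$ is a tree with probability $1 - O(\gamma^d/n)$,
where $\gamma$ depends only on the maximum node degrees, so
\[ E_{{\rm LDPC},l^n}\big[\langle\sigma_o\rangle_{N_d(o)}\big] = E_{{\rm LDPC},l}\big[\tanh(l+\Delta^{(d)})\big] + O\Big(\frac{\gamma^d}{n}\Big). \tag{13} \]
Thus in view of (12) the theorem will follow if we can show
that
\[ E_{l^n}[\langle\sigma_o\rangle_P] = E_{l^n}\big[\langle\sigma_o\rangle_{N_d(o)}\big] + O(e^{-\xi d}), \tag{14} \]
with $\xi > 0$ and $O(e^{-\xi d})$ uniform in $n$ and depending only
on $l_{\max}$, $r_{\max}$. Indeed, if (14) holds, combining with (13) we
get
\[ E_{{\rm LDPC},l^n}[\langle\sigma_o\rangle_P] = E_{{\rm LDPC},l}\big[\tanh(l+\Delta^{(d)})\big] + O\Big(\frac{\gamma^d}{n}\Big) + O(e^{-\xi d}), \]
and the theorem follows by taking first the limit $n \to +\infty$
and then $d \to +\infty$.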

Formula (14) follows directly from the next two lemmas.
Let $C_d(o)$ denote the circle of variable nodes at distance $d$
from $o$. Call $\langle - \rangle^{+}_{N_d(o)}$ the LDPC Gibbs measure associated
to the graph $N_d(o)$ with $\sigma_j = +1$ "boundary condition" for
$j \in C_d(o)$. First we will show

Lemma 1 (Cutting a piece of the Tanner graph): For
$\epsilon^2 < \epsilon_1^2 k^{-2}(\ln k)^{-1}$
\[ E_{l^n}[\langle\sigma_o\rangle_P] = E_{l^n}\big[\langle\sigma_o\rangle^{+}_{N_d(o)}\big] + O(e^{-\xi d}), \]
where $\xi > 0$ and $O(e^{-\xi d})$ depend only on $l_{\max}$, $r_{\max}$. In
particular they are independent of $n$.

The second step is to show that for $\epsilon$ small enough the
soft estimate of the bit at $o$ is independent of the boundary
conditions.

Lemma 2 (Independence from Boundary Conditions):
Under the same conditions as in Lemma 1
\[ E_{l^n}\big[\langle\sigma_o\rangle_{N_d(o)}\big] = E_{l^n}\big[\langle\sigma_o\rangle^{+}_{N_d(o)}\big] + O(e^{-\xi d}). \]

Proof of Lemma 1: We first introduce new interpolating
Gibbs measures. Label the variable nodes in $C_d(o)$ in some
arbitrary order $C_d(o) = \{1, 2, \ldots, N\}$ and assume these bits
are transmitted through a BIAWGN channel with noise vector
$\nu^N = (\nu_1, \ldots, \nu_N)$ with $0 \le \nu_k \le \epsilon$ (here $\nu_k^2$ is the noise
variance). Set $\nu^{(j)} = (0, \ldots, 0, \nu_j, \epsilon, \ldots, \epsilon)$ for $j = 1, \ldots, N$.
The interpolating Gibbs measures $\langle - \rangle_P^{\nu^{(j)}}$ are defined on the
full Tanner graph with noise vectors $\nu^{(j)}$ for bits in $C_d(o)$
and noise $\epsilon$ for all other bits. A crucial remark is that for
$\nu^N = (0, \ldots, 0) = 0$,
\[ E_{l^n}\big[\langle\sigma_o\rangle_P^{\nu^N = 0}\big] = E_{l^n}\big[\langle\sigma_o\rangle^{+}_{N_d(o)}\big]. \tag{15} \]

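Proceeding similarly to [3] we apply iteratively the fundamental
theorem of calculus,
\[ E_{l^n}[\langle\sigma_o\rangle_P] = E_{l^n}\big[\langle\sigma_o\rangle_P^{\nu^N = 0}\big] + \sum_{j=1}^{N}\int_0^{\epsilon} d\nu_j\,\frac{d}{d\nu_j}E_{l^n}\big[\langle\sigma_o\rangle_P^{\nu^{(j)}}\big]. \]
For the BIAWGN channel we have the remarkable formula
[13]
\[ \frac{d}{d\nu_j^{-2}}E_{l^n}\big[\langle\sigma_o\rangle_P^{\nu^{(j)}}\big] = E_{l^n}\Big[\big(\langle\sigma_o\sigma_j\rangle_P^{\nu^{(j)}} - \langle\sigma_o\rangle_P^{\nu^{(j)}}\langle\sigma_j\rangle_P^{\nu^{(j)}}\big)^2\Big]. \]
Then using (15) and $\frac{d}{d\nu_j} = -2\nu_j^{-3}\frac{d}{d\nu_j^{-2}}$ we obtain the sum rule
\[ E_{l^n}[\langle\sigma_o\rangle_P] = E_{l^n}\big[\langle\sigma_o\rangle^{+}_{N_d(o)}\big] - 2\sum_{j=1}^{N}\int_0^{\epsilon}\frac{d\nu_j}{\nu_j^{3}}\,E_{l^n}\Big[\big(\langle\sigma_o\sigma_j\rangle_P^{\nu^{(j)}} - \langle\sigma_o\rangle_P^{\nu^{(j)}}\langle\sigma_j\rangle_P^{\nu^{(j)}}\big)^2\Big]. \]
Now we apply the generalized form of Theorem 1, namely eq.
(2) (with possibly different numerical constants),
\[ E_{l^n}\Big[\big|\langle\sigma_o\sigma_j\rangle_P^{\nu^{(j)}} - \langle\sigma_o\rangle_P^{\nu^{(j)}}\langle\sigma_j\rangle_P^{\nu^{(j)}}\big|\Big] \le c_1 e^{-\frac{c}{\nu_j^2}}\, e^{-\frac{c_2}{\epsilon^2 k}d}. \]
Note that the prefactor $e^{-c/\nu_j^2}$ is important in order to get
convergent integrals in the sum rule. For the number of
boundary terms we have $N \le k^d$, which leads to the result
of the lemma for $\epsilon^2 < \epsilon_1^2 k^{-2}(\ln k)^{-1}$.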
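
The derivative formula behind the sum rule can be checked numerically in the simplest possible setting. The sketch below is an illustration under explicit assumptions: it treats a single isolated bit (no checks, $o = j$), for which $\langle\sigma\rangle = \tanh l$ and the connected correlation reduces to $1 - \tanh^2 l$; it takes the derivative with respect to the inverse noise variance $t = \nu^{-2}$; and it uses $l = t + \sqrt{t}\,z$ with $z$ a standard Gaussian, as appropriate for the all zero codeword. The function names and the quadrature parameters are hypothetical.

```python
import math

def gaussian_mean(g, z_max=12.0, steps=6000):
    """Trapezoid approximation of E[g(z)] for z ~ N(0, 1)."""
    h = 2.0 * z_max / steps
    total = 0.0
    for k in range(steps + 1):
        z = -z_max + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * g(z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

def mean_soft_bit(t):
    """E[tanh(l)] with l = t + sqrt(t) z, i.e. noise variance 1/t."""
    return gaussian_mean(lambda z: math.tanh(t + math.sqrt(t) * z))

def mean_squared_correlation(t):
    """E[(1 - tanh(l)^2)^2], the single-bit version of the squared correlation."""
    return gaussian_mean(lambda z: (1.0 - math.tanh(t + math.sqrt(t) * z) ** 2) ** 2)

T = 1.5    # hypothetical inverse noise variance
STEP = 1e-4
# Central finite difference of E[tanh(l)] with respect to t.
LHS = (mean_soft_bit(T + STEP) - mean_soft_bit(T - STEP)) / (2.0 * STEP)
RHS = mean_squared_correlation(T)
```

The finite-difference derivative `LHS` and the squared-correlation average `RHS` should agree up to discretization error, which illustrates why lowering the noise on a bit can only increase the average soft estimate.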

Proof of Lemma 2: The proof is similar to that of Lemma 1
with $\langle - \rangle_P$ replaced by $\langle - \rangle_{N_d(o)}$.

VI. Discussion 

Consider code ensembles such that the MAP-GEXIT curve
has only one discontinuity at $\epsilon_{\rm MAP}$ and vanishes for $\epsilon < \epsilon_{\rm MAP}$.
Because of the perturbative nature of the cluster expansion
our estimates for Theorem 1 only work much below $\epsilon_{\rm MAP}$.
The exact range of validity of the decay stated in the
theorem is an open question. Now let $\epsilon_{\rm BP}$ be the Belief
Propagation threshold. We know that Theorem 2 cannot be
valid for $\epsilon_{\rm BP} < \epsilon < \epsilon_{\rm MAP}$ since in this range the BP and MAP
estimates differ. In view of the sum rule in the proof of Lemma
1 this means that for this range the decay of correlations (even
if exponential) cannot overcome the exponential growth of
the number of nodes in $C_d(o)$. An interesting question is to
determine if the smallest $\epsilon$ for which this happens has a clear
algorithmic significance and if it is in any way related to $\epsilon_{\rm BP}$.

Consider now the case of cycle codes, or of codes with
a sufficient fraction of degree two variable nodes (and no nodes
of degree one), such that the GEXIT function is equal to zero
for $\epsilon < \epsilon_{\rm MAP}$, is non zero for $\epsilon > \epsilon_{\rm MAP}$, while it remains
continuous at $\epsilon_{\rm MAP}$ (the curve may have a discontinuity at a
higher noise value $\epsilon_c$). Although in this case the statement of
Theorem 2 may be valid for some range of $\epsilon$ above $\epsilon_{\rm MAP}$, our
proof only works below $\epsilon_{\rm MAP}$. This can be explicitly seen
from Lemma 2 and the fact that $E_{l^n}[\langle\sigma_o\rangle^{+}_{N_d(o)} \mid N_d(o) \text{ is a tree}] = 1$,
which imply that our proof only works in a range where the
GEXIT function vanishes. Our analysis is not powerful enough
to capture any interesting behavior of the GEXIT function for
$\epsilon_{\rm MAP} < \epsilon < \epsilon_c$.

Finally, consider the case of ensembles with some fraction
of degree one nodes and a GEXIT function that does not
vanish all the way down to $\epsilon \to 0$ (with possibly a discontinuity
at some $\epsilon_c$). An example is given by LDPC ensembles
with Poisson degree distribution for variable nodes. Note that
here $E_{l^n}[\langle\sigma_o\rangle^{+}_{N_d(o)} \mid N_d(o) \text{ is a tree}] \neq 1$ because the tree still
contains leaves (at distance $< d$ from $o$) with free boundary
conditions. In this case Theorem 2 really captures a non trivial
behavior of the GEXIT curve for small $\epsilon$. It extends to other
channels previous results [10], [11] that had been obtained
only for the BEC. This also proves that the replica solution is
indeed correct for channels other than the BEC.